Vocal Melody Extraction via HRNet-Based Singing Voice Separation and Encoder-Decoder-Based F0 Estimation
Abstract
Vocal melody extraction is an important and challenging task in music information retrieval. One main difficulty is that, most of the time, various instruments and singing voices are mixed according to harmonic structure, making it hard to identify the fundamental frequency (F0) of a singing voice. Therefore, reducing the interference of the accompaniment is beneficial to pitch estimation of the singing voice. In this paper, we first adopted a high-resolution network (HRNet) to separate vocals from polyphonic music, then designed an encoder-decoder network to estimate the vocal F0 values. Experimental results demonstrate the effectiveness of the HRNet-based singing voice separation method in reducing the interference of the accompaniment on the extraction of the vocal melody, and the proposed vocal melody extraction (VME) system outperforms other state-of-the-art algorithms in most cases.
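The separate-then-estimate idea in the abstract can be illustrated with a toy sketch. This is not the paper's system: an oracle ratio mask (computed from the known sources) stands in for the HRNet separator, and a crude spectral-peak picker stands in for the encoder-decoder F0 estimator. The sample rate, tone frequencies, amplitudes, and all function names are assumptions chosen for illustration.

```python
import numpy as np

SR = 8192  # assumed sample rate in Hz; with N == SR each FFT bin is 1 Hz wide
N = 8192   # number of samples (one second of audio)


def tone(freq_hz, amp):
    """Synthesize a pure sine tone (toy stand-in for a real source)."""
    t = np.arange(N) / SR
    return amp * np.sin(2 * np.pi * freq_hz * t)


def estimate_f0(x):
    """Crude F0 estimate: frequency of the strongest spectral peak.

    Hypothetical stand-in for a learned F0 estimator.
    """
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / SR)
    return float(freqs[np.argmax(spectrum)])


def oracle_mask_separate(mix, vocal, accomp):
    """Apply an oracle ratio mask to the mixture spectrum.

    Uses the true sources, which a real separator never sees; it only
    shows the effect that a good separation front end would have.
    """
    v = np.abs(np.fft.rfft(vocal))
    a = np.abs(np.fft.rfft(accomp))
    mask = v / (v + a + 1e-12)
    return np.fft.irfft(mask * np.fft.rfft(mix), n=len(mix))


vocal = tone(220.0, amp=0.5)    # quieter "singing voice"
accomp = tone(330.0, amp=1.0)   # louder "accompaniment"
mix = vocal + accomp

f0_mix = estimate_f0(mix)  # estimate taken directly on the mixture
f0_sep = estimate_f0(oracle_mask_separate(mix, vocal, accomp))

print("F0 from mixture:", f0_mix)
print("F0 after separation:", f0_sep)
```

In this toy setup the estimate on the raw mixture locks onto the louder accompaniment tone, while the estimate on the masked signal recovers the vocal tone, which is the motivation the abstract gives for separating vocals before estimating pitch.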
Similar resources
Singing Voice Separation Using Deep Neural Networks and F0 Estimation
Deep Neural Networks (DNNs) have become a popular approach for speech enhancement and singing voice separation. DNNs are typically trained to estimate a time-frequency mask using ground truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.
Improving accompanied Flamenco singing voice transcription by combining vocal detection and predominant melody extraction
While recent approaches to automatic voice melody transcription of accompanied flamenco singing give promising results regarding pitch accuracy, mistakenly transcribed guitar sections represent a major limitation for the obtained overall precision. With the aim of reducing the amount of false positives in the voicing detection, we propose a fundamental frequency contour estimation method which ...
Transferring Vocal Expression of F0 Contour Using Singing Voice Synthesizer
A system for transferring vocal expressions separately from singing voices with accompaniment to singing voice synthesizers is described. The expressions appear as fluctuations in the fundamental frequency contour of the singing voice, such as vibrato, glissando, and kobushi. The fundamental frequency contour of the singing voice is estimated using the subharmonic summation in a limited frequen...
Singing Voice Separation Based on Non-vocal Independent Component Subtraction and Amplitude Discrimination
Many applications of Music Information Retrieval can benefit from effective isolation of the music sources. Earlier work by the authors led to the development of a system that is based on Azimuth Discrimination and Resynthesis (ADRess) and can extract the singing voice from reverberant stereophonic mixtures. We propose an extension to our previous method that is not based on ADRess and exploits...
Classification-Based Singing Melody Extraction Using Deep Convolutional Neural Networks
Singing melody extraction is the task that identifies the melody pitch contour of singing voice from polyphonic music. Most of the traditional melody extraction algorithms are based on calculating salient pitch candidates or separating the melody source from the mixture. Recently, classification-based approach based on deep learning has drawn much attention. In this paper, we present a...
Journal
Journal title: Electronics
Year: 2021
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics10030298